 extreme scenario


LLM-empowered Agents Simulation Framework for Scenario Generation in Service Ecosystem Governance

Zhou, Deyu, Hou, Yuqi, Xue, Xiao, Lu, Xudong, Li, Qingzhong, Cui, Lizhen

arXiv.org Artificial Intelligence

As the social environment grows more complex and collaboration deepens, the factors affecting the healthy development of the service ecosystem are diverse and constantly changing, making its governance a crucial research issue. Applying the scenario analysis method and conducting scenario rehearsals on a constructed experimental system before managers make decisions can largely avoid losses caused by wrong decisions. However, this approach relies on predefined rules to construct scenarios and faces challenges such as limited information, a large number of influencing factors, and the difficulty of measuring social elements. These challenges limit the quality and efficiency of generating social and uncertain scenarios for the service ecosystem. Therefore, we propose a scenario generator design method that adaptively coordinates three Large Language Model (LLM) empowered agents, which autonomously optimize experimental schemes to construct an experimental system and generate high-quality scenarios. Specifically, the Environment Agent (EA) generates the social environment, including extremes; the Social Agent (SA) generates the social collaboration structure; and the Planner Agent (PA) couples task-role relationships and plans task solutions. These agents work in coordination, with the PA adjusting the experimental scheme in real time by perceiving the state of each agent and the scenarios being generated. Experiments on the ProgrammableWeb dataset show that our method generates more accurate scenarios more efficiently and provides an effective way to construct experimental systems for service ecosystem governance.


FLEX: A Benchmark for Evaluating Robustness of Fairness in Large Language Models

Jung, Dahyun, Lee, Seungyoon, Moon, Hyeonseok, Park, Chanjun, Lim, Heuiseok

arXiv.org Artificial Intelligence

Recent advancements in Large Language Models (LLMs) have significantly enhanced interactions between users and models. These advancements concurrently underscore the need for rigorous safety evaluations due to the manifestation of social biases, which can lead to harmful societal impacts. Despite these concerns, existing benchmarks may overlook the intrinsic weaknesses of LLMs, which can generate biased responses even with simple adversarial instructions. To address this critical gap, we introduce a new benchmark, Fairness Benchmark in LLM under Extreme Scenarios (FLEX), designed to test whether LLMs can sustain fairness even when exposed to prompts constructed to induce bias. To thoroughly evaluate the robustness of LLMs, we integrate prompts that amplify potential biases into the fairness assessment. Comparative experiments between FLEX and existing benchmarks demonstrate that traditional evaluations may underestimate the inherent risks in models. This highlights the need for more stringent LLM evaluation benchmarks to guarantee safety and fairness.


Extreme Scenario Selection in Day-Ahead Power Grid Operational Planning

Terrén-Serrano, Guillermo, Ludkovski, Michael

arXiv.org Machine Learning

We propose and analyze the application of statistical functional depth metrics for the selection of extreme scenarios in day-ahead grid planning. Our primary motivation is the screening of probabilistic scenarios for realized load and renewable generation, in order to identify the scenarios most relevant for operational risk mitigation. To handle the high dimensionality of the scenarios across asset classes and intra-day periods, we employ functional measures of depth to sub-select outlying scenarios that are most likely to be the riskiest for grid operation. We investigate a range of functional depth measures, as well as a range of operational risks, including load shedding, operational costs, reserves shortfall and variable renewable energy curtailment. The effectiveness of the proposed screening approach is demonstrated through a case study on the realistic Texas-7k grid.
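
The abstract does not say which depth measures are used. As an illustration only, the following minimal sketch assumes the modified band depth (with bands formed by pairs of curves) and picks the least deep scenarios as the "extreme" ones. The function names and the toy scenario values are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def modified_band_depth(curves):
    """Modified band depth (pairwise bands) of each curve.

    `curves` is a list of equal-length lists, one per scenario.
    A curve's depth is the average, over all pairs of curves, of the
    fraction of time points at which it lies inside the pair's band.
    """
    n_points = len(curves[0])
    pairs = list(combinations(range(len(curves)), 2))
    depths = []
    for curve in curves:
        total = 0.0
        for j, k in pairs:
            inside = sum(
                1
                for t in range(n_points)
                if min(curves[j][t], curves[k][t])
                <= curve[t]
                <= max(curves[j][t], curves[k][t])
            )
            total += inside / n_points
        depths.append(total / len(pairs))
    return depths


def select_extreme(curves, n_extreme):
    """Return indices of the `n_extreme` least deep (most outlying) curves."""
    depths = modified_band_depth(curves)
    order = sorted(range(len(curves)), key=lambda i: depths[i])
    return order[:n_extreme]


# Toy intra-day scenarios (hypothetical values); the last one is outlying.
scenarios = [
    [1.0, 1.1, 1.2, 1.1],
    [1.1, 1.2, 1.3, 1.2],
    [0.9, 1.3, 1.1, 1.3],
    [1.2, 1.0, 1.4, 1.0],
    [3.0, 3.5, 4.0, 3.8],
]
depths = modified_band_depth(scenarios)
extreme = select_extreme(scenarios, 1)
```

Band depth is rank-based, so a curve that is uniformly highest and one that is uniformly lowest receive the same depth; a screening pipeline that cares about the direction of the risk would need an additional criterion.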


ExGAN: Adversarial Generation of Extreme Samples

Bhatia, Siddharth, Jain, Arjit, Hooi, Bryan

arXiv.org Artificial Intelligence

Mitigating the risk arising from extreme events is a fundamental goal with many applications, such as the modelling of natural disasters, financial crashes, epidemics, and many others. To manage this risk, a vital step is to be able to understand or generate a wide range of extreme scenarios. Existing approaches based on Generative Adversarial Networks (GANs) excel at generating realistic samples, but seek to generate typical samples rather than extreme samples. Hence, in this work, we propose ExGAN, a GAN-based approach to generate realistic and extreme samples. To model the extremes of the training distribution in a principled way, our work draws from Extreme Value Theory (EVT), a probabilistic approach for modelling the extreme tails of distributions. For practical utility, our framework allows the user to specify both the desired extremeness measure and the desired extremeness probability they wish to sample at. Experiments on real US Precipitation data show that our method generates realistic samples, based on visual inspection and quantitative measures, in an efficient manner. Moreover, generating increasingly extreme examples using ExGAN can be done in constant time (with respect to the extremeness probability), as opposed to the exponential time required by the baseline approach.
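
To make the EVT idea concrete, here is a generic peaks-over-threshold sketch: it fits a generalized Pareto distribution to exceedances over a high threshold and turns a chosen tail probability into a level. This is not ExGAN's implementation; the method-of-moments fit, the 95% threshold, the synthetic exponential data and all function names are assumptions made for illustration.

```python
import math
import random


def fit_gpd_moments(exceedances):
    """Method-of-moments fit of a generalized Pareto distribution.

    Returns (shape, scale). Assumes the true shape is below 0.5 so that
    the variance of the exceedances is finite.
    """
    n = len(exceedances)
    mean = sum(exceedances) / n
    var = sum((y - mean) ** 2 for y in exceedances) / (n - 1)
    ratio = mean * mean / var
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * mean * (ratio + 1.0)
    return shape, scale


def tail_quantile(threshold, shape, scale, exceed_rate, tail_prob):
    """Level x with P(X > x) = tail_prob, for tail_prob < exceed_rate."""
    r = exceed_rate / tail_prob
    if abs(shape) < 1e-9:
        # Limit of the formula below as the shape goes to zero.
        return threshold + scale * math.log(r)
    return threshold + scale / shape * (r ** shape - 1.0)


# Synthetic data (assumption): exponential samples stand in for an
# "extremeness measure" such as total precipitation per sample.
rng = random.Random(0)
samples = sorted(rng.expovariate(1.0) for _ in range(5000))
threshold = samples[int(0.95 * len(samples))]
exceedances = [x - threshold for x in samples if x > threshold]
exceed_rate = len(exceedances) / len(samples)

shape, scale = fit_gpd_moments(exceedances)
level_1_in_100 = tail_quantile(threshold, shape, scale, exceed_rate, 0.01)
level_1_in_1000 = tail_quantile(threshold, shape, scale, exceed_rate, 0.001)
```

Once the tail is fitted, the level for any smaller tail probability comes from a closed-form expression, so asking for a rarer event costs no extra sampling.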


Your Data Is Biased, Here's Why - InformationWeek

@machinelearnbot

Bias is everywhere, including in your data. A little skew here and there may be fine if the ramifications are minimal, but bias can negatively affect your company and its customers if left unchecked, so you should make an effort to understand how, where and why it happens. "Many [business leaders] trust the technical experts but I would argue that they're ultimately responsible if one of these models has unexpected results or causes harm to people's lives in some way," said Steve Mills, a principal and director of machine intelligence at technology and management consulting firm Booz Allen Hamilton. In the financial industry, for example, biased data may produce results that violate the Equal Credit Opportunity Act (fair lending). That law, enacted in 1974, prohibits credit discrimination based on race, color, religion, national origin, sex, marital status, age or source of income.


The Robots Will Make the Best Fake News

#artificialintelligence

Imagine that tomorrow, some smart kid invented a technology that let people or physical goods pass through walls, and posted instructions for how to build it cheaply from common household materials. Lots of industries would probably become more productive. Being able to walk through walls instead of being forced to use doors would make it easier to navigate offices, move goods in and out of warehouses and accomplish any number of mundane tasks. That would give the economy a boost. But the negative might well outweigh the positive.